Mining Distributed Private Databases Using Random Response Protocols

Authors

  • Li Xiong
  • Pawel Jurczyk
  • Ling Liu
Abstract

There is a growing demand for sharing data repositories, which often contain personal information, across multiple autonomous, possibly untrusted, private databases. This paper discusses the constraints that individual privacy and institutional data confidentiality impose on data mining across multiple databases, and presents our initial solutions. We develop a suite of decentralized protocols that aim to effectively anonymize the data for each individual database and compute the query results across databases in a probabilistically secure manner. By relaxing the privacy constraints and accuracy requirements, the protocols achieve efficiency and scalability not offered by traditional secure multiparty computation approaches. Our primary viewpoint is that some approximation is tolerable and even desirable for scalable and robust mining across large, multiparty distributed environments.
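The abstract does not spell out the protocols themselves, so the following is only a minimal sketch of the general idea behind a randomized response: each database perturbs its answers locally, and the aggregate is still estimated with bounded error. It uses the classic Warner randomized-response design, which is an assumption chosen for illustration and not the authors' protocol. The names `randomized_response`, `estimate_proportion`, `simulate`, and the parameter `p_truth` are hypothetical.

```python
import random


def randomized_response(truth: bool, p_truth: float, rng: random.Random) -> bool:
    """Report the true bit with probability p_truth, otherwise report its negation."""
    return truth if rng.random() < p_truth else not truth


def estimate_proportion(reports, p_truth: float) -> float:
    """Estimate the true 'yes' rate from perturbed reports.

    Under Warner's design, P(report yes) = p * pi + (1 - p) * (1 - pi),
    so pi = (observed - (1 - p)) / (2p - 1).
    """
    if p_truth == 0.5:
        raise ValueError("p_truth = 0.5 carries no information about the data")
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p_truth)) / (2.0 * p_truth - 1.0)


def simulate(databases, p_truth: float, seed: int = 0) -> float:
    """Each database perturbs its own records; only noisy bits are pooled."""
    rng = random.Random(seed)
    pooled = []
    for records in databases:
        # Perturbation happens locally, so raw records never leave a site.
        pooled.extend(randomized_response(r, p_truth, rng) for r in records)
    return estimate_proportion(pooled, p_truth)


if __name__ == "__main__":
    # Three hypothetical sites holding boolean attributes (30% true overall).
    sites = [[i % 10 < 3 for i in range(10_000)] for _ in range(3)]
    print(f"estimated rate: {simulate(sites, p_truth=0.75):.3f}")
```

The trade-off mirrors the one described in the abstract: a `p_truth` closer to 0.5 hides individual records better but makes the estimate noisier, while a `p_truth` closer to 1 improves accuracy at the cost of privacy.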

Similar resources

Privacy-Preserving Mining of Association Rules on Distributed Databases

Data mining techniques can extract hidden but useful information from large databases. Most efficient approaches for mining distributed databases suppose that all of the data at each site can be shared. However, source transaction databases usually include very sensitive information. In order to obtain an accurate mining result on distributed databases and to preserve the private data that is a...

Provably secure and efficient identity-based key agreement protocol for independent PKGs using ECC

Key agreement protocols are essential for secure communications in open and distributed environments. Recently, identity-based key agreement protocols have been increasingly researched because of the simplicity of public key management. The basic idea behind an identity-based cryptosystem is that a public key is the identity (an arbitrary string) of a user, and the corresponding private key is ...

Privacy Preserving CART Algorithm over Vertically Partitioned Data

Data mining classification algorithms are centralized algorithms that work on a centralized database. In this information age, organizations use distributed databases. Since data mining of private data is one of the keys to success for an organization, it is a challenging task to implement data mining in a distributed database. Collaboration among different organizations brings mutual benefits to the pa...

Privacy Preserving k-Means Clustering in Multi-Party Environment

Extracting meaningful and valuable knowledge from databases is often done by various data mining algorithms. Nowadays, databases are distributed among two or more parties for reasons such as physical and geographical restrictions, and the most important issue is privacy. Related data is normally maintained by more than one organization, each of which wants to keep its individual...

Honest-Verifier Private Disjointness Testing Without Random Oracles

This paper presents an efficient construction of a private disjointness testing protocol that is secure against malicious provers and honest-but-curious (semi-honest) verifiers, without the use of random oracles. In a completely semi-honest setting, this construction implements a private intersection cardinality protocol. We formally define both private intersection cardinality and private disj...

Publication date: 2007